Summary

Since the emergence of Middle East respiratory syndrome coronavirus (MERS-CoV) in 
2012, there have been a number of clusters of human-to-human transmission. These 
cases of human-to-human transmission involve close contact and have occurred pri- 
marily in healthcare settings, and they are suspected to result from repeated zoonotic 
introductions. In this study, we sequenced whole MERS-CoV genomes directly from 
respiratory samples collected from 23 confirmed MERS cases in the United Arab 
Emirates (UAE). These samples included cases from three nosocomial and three house- 
hold clusters. The sequences were analysed for changes and relatedness with regard 
to the collected epidemiological data and other available MERS-CoV genomic data. 
Sequence analysis supports the epidemiological data within the clusters, and further, 
suggests that these clusters emerged independently. To understand how and when 
these clusters emerged, respiratory samples were taken from dromedary camels, a 
known host of MERS-CoV, in the same geographic regions as the human clusters. 
Middle East respiratory syndrome coronavirus genomes from six virus-positive ani- 
mals were sequenced, and these genomes were nearly identical to those found in 
human patients from corresponding regions. These data demonstrate a genetic link for 
each of these clusters to a camel and support the hypothesis that human MERS-CoV
diversity results from multiple zoonotic introductions.
1 | INTRODUCTION


Middle East respiratory syndrome coronavirus (MERS-CoV) was first 
identified in the Kingdom of Saudi Arabia (KSA) in September 2012 
(Zaki, van Boheemen, Bestebroer, Osterhaus, & Fouchier, 2012). It 
is a group C betacoronavirus, distantly related to the severe acute 
respiratory syndrome coronavirus (SARS-CoV) which caused an out- 
break of severe respiratory illness in 2002-2003 (Zaki et al., 2012). 
Middle East respiratory syndrome coronavirus also causes a similar 
acute respiratory illness, and as of December 2017, 2103 cases have 
been confirmed in 27 countries with 733 deaths (35% case fatality 
ratio) (World Health Organization, 2017). Infection has been diag- 
nosed in multiple countries, but all cases have an epidemiologic link 
to the Middle East. 

Like SARS-CoV, MERS-CoV is thought to be of animal origin. 
Investigations of bats and other animals have found near identical se- 
quences, including MERS-CoV in camels (Briese et al., 2014; Corman 
et al., 2014; Hemida et al., 2014; Raj et al., 2014; Reusken, Farag, 
et al., 2014; Reusken, Messadi, et al., 2014). Middle East respiratory 
syndrome coronavirus can replicate efficiently in the upper respiratory 
tract of dromedary camels (referred to hereafter as “camels”) (Adney 
et al., 2014). Further studies identified that the MERS-CoV receptor, 
dipeptidyl peptidase 4 (DPP4), is relatively conserved among mam- 
mals, suggesting a likelihood of cross-species transmission (Raj et al., 
2013). However, only camels and alpacas have been found seropos- 
itive (Reusken, Ababneh, et al., 2013; Reusken et al., 2016), and the 
virus has only been isolated from camels (Hemida et al., 2014; Raj 
et al., 2014; Reusken, Ababneh, et al., 2013). The evidence suggests 
that MERS-CoV may jump between other mammals and camels, and 
that camels play a role as an intermediate host that transmits MERS- 
CoV directly to humans (Al Hammadi et al., 2015; Anthony et al., 2017; 
Azhar et al., 2014; Drosten, Kellam, & Memish, 2014; Memish et al., 
2014; Nowotny & Kolodziejek, 2014; Sabir et al., 2016). 

Based on phylogenetic analysis of MERS-CoV genomes, it appears 
that there have been multiple, independent zoonotic introductions 
of MERS-CoV into the human population, resulting in the observed 
human MERS-CoV diversity (Cotten et al., 2013, 2014). Nosocomial 
transmission has accounted for several clusters associated with multi- 
ple hospitals in the KSA (Cotten et al., 2013, 2014; Memish, Al-Tawfiq, 
& Assiri, 2013). Human-to-human transmission appears to require 
extended close contact with an infected individual. Consequently, 
most of the clusters have occurred in families and healthcare workers. 
Tertiary transmission was observed with the 2015 outbreak of MERS- 
CoV in South Korea (Cho et al., 2016; Park et al., 2015). 

The first half of 2014 saw a large increase in the number of 
MERS-CoV cases, including clusters in hospitals and household set- 
tings in the United Arab Emirates (UAE). These larger outbreaks raise 
the question of whether these outbreak-associated strains have en- 
hanced transmissibility (Drosten, Muth, et al., 2014). SARS-CoV ac- 
quired characteristic genetic changes as the virus was sampled from 
humans during the early epidemic, suggesting that these mutations 
may play a role in either replication fitness or transmissibility (Bolles,
Donaldson, & Baric, 2011; Chinese SARS Molecular Epidemiology
Consortium, 2004; Song et al., 2005; Wong, Li, Moore, Choe, &
Farzan, 2004). Genome sequencing of MERS-CoV is critical in
understanding molecular determinants of pathogenesis and in
understanding transmission patterns.

Impacts

• Middle East respiratory syndrome coronavirus (MERS-CoV) is an
  important human pathogen with a high mortality rate that emerged
  from a zoonotic reservoir and has been transmitted between humans.

• This study shows the close genetic relationship between the MERS-CoV
  virus genome sequences transmitted within outbreak clusters in the
  United Arab Emirates, which will aid in future epidemiological studies.

• Camels are an important reservoir of MERS-CoV; in this study,
  MERS-CoVs sampled from UAE camels are sequenced and they demonstrate
  that the outbreak viruses emerged repeatedly from the animal reservoir.

As part of the response to the increased numbers of MERS-CoV 
cases and clusters in and around Abu Dhabi, respiratory samples were 
collected from patients with MERS and contacts along with extensive 
epidemiological data (Hunter et al., 2016). In this study, to understand 
how genomics can help resolve questions of transmission, full MERS- 
CoV genomes from 19 of the 2013-2014 MERS clinical samples were 
sequenced and analysed, along with four additional partial sequences 
(spike and nucleocapsid genes). These cases include patients from 
three hospital-associated clusters, three household-associated clus- 
ters and three sporadic cases from the UAE (Al Hosani et al., 2016; 
Hunter et al., 2016). Additionally, due to the risk factors associated 
with contact with dromedary camels, respiratory samples were col- 
lected from camels at farms near where the human cases originated. 
Full MERS-CoV genomes were sequenced from six of the camel samples
to better understand the role of animals in these outbreaks and in
the recent evolution of the virus.


2 | MATERIALS AND METHODS 


2.1 | MERS-CoV human case clusters and sample 
collection 


A total of 65 patients with MERS-CoV were identified during our
investigation in the UAE from July 2013 through May 2014. Among these
65 patients, extensive epidemiological investigation identified six
known clusters of human-to-human MERS-CoV transmission as well as
sporadic cases (Al Hosani et al., 2016; Hunter et al., 2016). The
available respiratory samples from 23 patients analysed at the US
Centers for Disease Control and Prevention (CDC), and potential camel
contacts, are placed in context in Figures 1 and 2 and Tables 1 and 2.
These samples are from three healthcare-associated clusters (HCA I, II
and III), three household clusters (HH A, B, C) and two sporadic cases
(Table 1, Figure 1). Activities involved in this investigation were
reviewed by CDC and by the Health Authority of Abu Dhabi and were
determined to be an urgent public health response that did not
constitute human subjects research.

FIGURE 1 Geographic distribution of human and camel cases in the UAE,
2013-2014. Map of the United Arab Emirates showing approximate location
of each sequenced human MERS case, as well as the location of the
Middle East respiratory syndrome coronavirus (MERS-CoV)-positive camels
sampled in this study. Each marker represents an individual case
sequenced in this study. Arrows represent the importation of cases to
the indicated location. [Colour figure can be viewed at
wileyonlinelibrary.com]

FIGURE 2 Summary of cases sequenced from the UAE 2013-2014 clusters.
Each Middle East respiratory syndrome coronavirus (MERS-CoV) genome
sequenced from the UAE clusters is shown as a circle. Filled circles
represent those human samples where full MERS-CoV genome sequence was
obtained, stripe-filled circles represent where only S and N gene
sequences were obtained. Triangles represent associated MERS cases
where no sequence was obtained. Cases are plotted along a timeline
corresponding with the index case infection date, and those associated
with frequent exposure to camels are in the shaded region. Camels that
were directly implicated in a cluster are connected by a solid arrow.
Human cases where the sequence indicates that there is significant
similarity to a camel are connected by a dashed arrow. [Colour figure
can be viewed at wileyonlinelibrary.com]

TABLE 1 Summary of 23 Middle East respiratory syndrome coronavirus clinical samples sequenced and submitted to GenBank

Case       Type       Sample collection date  Sequence type  GenBank accession         Distance to index (nucleotide)

Healthcare-associated clusters (HCA) I (Abu Dhabi)
2013_002   Index      2013 Jul 10             Genome         KY581684                  -
2013_003   Secondary  2013 Jul 12             S/N            S: KY673146; N: KY673143  0ᵃ
2013_004   Secondary  2013 Jul 12             Genome         KY581685                  1

Other (Oman)
2013_007   Sporadic   2013 Oct 12             S/N            S: KP236092; N: KP236093  -

HH A (Abu Dhabi)
2013_008   Index      2013 Nov 24             S/N            N: KY673144               -
2013_009   Secondary  2013 Nov 25             Genome         KP209312                  0ᵃ

HH B (Dubai/Abu Dhabi)
2013_011   Secondary  2013 Dec 23             Genome         KY581687                  -

HCA II (Western Region)
2014_002   Index      2014 Mar 16             Genome         KP209310                  -

HH C (Al Ain)
2014_008ᵇ  Index      2014 Apr 09             Genome         KP209306                  -
2014_009   Secondary  2014 Apr 10             Genome         KY581686                  1
2014_011   Secondary  2014 Apr 10             Genome         KY581688                  2
2014_015   Secondary  2014 Apr 10             Genome         KY581689                  2

HCA III (Al Ain)
2014_008ᵇ  Index      2014 Apr 09             Genome         KP209306                  -
2014_016   Secondary  2014 Apr 12             Genome         KP209308                  3
2014_017   Secondary  2014 Apr 12             Genome         KY581690                  2
2014_018   Secondary  2014 Apr 12             Genome         KP209307                  2
2014_023   Secondary  2014 Apr 14             Genome         KY581691                  0
2014_025   Secondary  2014 Apr 15             Genome         KY581692                  1
2014_026   Tertiary   2014 Apr 15             Genome         KP209313                  1
2014_030   Secondary  2014 Apr 16             Genome         KP209309                  0
2014_033   Tertiary   2014 Apr 20             Genome         KP209311                  0
2014_045   Tertiary   2014 Apr 25             S/N            S: KY673147; N: KY673145  0ᵃ

Other (Jeddah, Kingdom of Saudi Arabia)
2014_032   Sporadic   2014 Apr 20             Genome         KY581693                  -

Other
2014_XXX   Unknown    Unknown                 Genome         KY581694                  -

ᵃSequences compared are not full genomes.
ᵇ2014_008 is the index patient for both HH C and HCA III.


2.2 | Camel sample collection

Nasopharyngeal swabs from dromedary camels were collected in the UAE
(Figure 1, Table 2) as approved by the CDC Institutional Animal Care
and Use Committee. Three samples (1B-A, 2B-E and 1H-F) were collected
in May 2014 at the border with Saudi Arabia where there were no known
directly linked human cases. Data on age, sex or clinical signs were
not available. Samples 3B-C and 1H-D were collected in February 2014
from two one-year-old male camels located within 500 metres of another
farm linked with a human case in the Al Ain area. Sample 1H-B was
collected in March 2014 from a 2-month-old male camel which presented
with mucopurulent discharge. This male camel belonged to a farm located
in the Western Region, which was linked to a human infection reported
on 10 March 2014.

TABLE 2 Summary of Middle East respiratory syndrome coronavirus genomes sequenced from UAE camels

Region                          Sample ID  Sample date  Closest human case       Distance from human (nucleotide)  GenBank accession
Kingdom of Saudi Arabia Border  1B-A       2014 May 28  2014_XXX                 8                                 KY581695
                                2B-Eᵃ      2014 May 28                                                             KY581699
                                1H-Fᵃ      2014 May 28                                                             KY581698
Al Ain                          3B-Cᵇ      2014 Feb 17  2014_008, 011, 030, 033  7                                 KY581700
                                1H-Dᵇ      2014 Feb 17                                                             KY581697
Western Region                  1H-B       2014 Mar 11  2014_002                 3                                 KY581696

ᵃThese sequences are identical to each other.
ᵇThese sequences are identical to each other.


2.3 | Sanger sequencing and deep 
sequencing analysis 


Middle East respiratory syndrome coronavirus genomes were amplified
by 32 pairs of nested, genome-spanning RT-PCRs, in 50 µl
reactions or in nano-volume reactions using the Fluidigm Access 
Array (van Boheemen et al., 2012; Hunter et al., 2016). When 
sample quality or quantity was too low, the spike and/or nucle- 
ocapsid (S/N) genes were sequenced using alternative nested 
PCR primers. Sanger data were analysed using Sequencher 5.0. 
For high-throughput sequencing, amplicon pools from each sam- 
ple were sheared from 800-1,200 bp to 400-500 bp and were 
used to generate barcoded libraries with the NEBNext Ultra DNA 
library prep kit (NEB, Ipswich, MA). Sequencing was performed
using an Illumina MiSeq instrument, multiplexing 5-10 samples per 
2 x 250 bp MiSeq run. 

Next-generation sequencing data were analysed using a cus- 
tom workflow in CLC Genomics Workbench 8.5 (Qiagen, Hilden, 
Germany). Adapters, and an additional 26 bp, were trimmed from 
each end to remove any residual PCR primer sequence. Remaining 
reads were trimmed from the 3’ end using a CLC cumulative quality 
score of 0.05. Trimmed reads were aligned to a reference, and a
consensus sequence was called based on regions that had 10× or greater
coverage.
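The coverage rule described above can be illustrated with a minimal sketch. This is not the CLC Genomics Workbench algorithm: the majority-rule base call, the use of `N` for positions below the threshold, and the names `call_consensus` and `toy_pileup` are assumptions made for illustration only. Only the 10× threshold comes from the text.

```python
from collections import Counter

MIN_COVERAGE = 10  # coverage threshold stated in the text


def call_consensus(pileup, min_coverage=MIN_COVERAGE):
    """Call a consensus from per-position read bases.

    pileup: list of strings; each string holds the bases that aligned
    reads show at one reference position (hypothetical input format).
    Positions with fewer than min_coverage usable bases become 'N'
    (an assumption; the text only says the consensus was called on
    regions with at least 10x coverage).
    """
    consensus = []
    for column in pileup:
        counts = Counter(b for b in column.upper() if b in "ACGT")
        depth = sum(counts.values())
        if depth < min_coverage:
            consensus.append("N")
        else:
            # Simple majority rule, assumed here for illustration.
            consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)


# Hypothetical four-position pileup with depths 12, 9, 12 and 3.
toy_pileup = ["A" * 12, "C" * 9, "G" * 8 + "T" * 4, "T" * 3]
print(call_consensus(toy_pileup))
```

In this toy input, only the first and third positions reach the threshold, so the other two are masked.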

Variants from the reference sequence comprising at least 5% of 
reads were identified by the quality-based variant detection algorithm 
in CLC Genomics Workbench, using a neighbourhood radius of 5, 
minimum neighbourhood quality score of 25 and a minimum central
quality score of 29.
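The 5% frequency cut-off can be sketched as follows. This is a simplified illustration, not the CLC quality-based variant detection algorithm: the neighbourhood and central quality filters are not modelled, and the function name `minor_variants` and the read depth of 1,000 are hypothetical.

```python
from collections import Counter


def minor_variants(column, reference_base, min_fraction=0.05):
    """Return non-reference bases whose read fraction meets the cut-off.

    column: string of bases observed at one position across reads
    (hypothetical input; base-quality filtering is not modelled here).
    """
    counts = Counter(b for b in column.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {
        base: n / depth
        for base, n in counts.items()
        if base != reference_base and n / depth >= min_fraction
    }


# Hypothetical position covered by 1,000 reads, 306 of which carry C.
column = "A" * 694 + "C" * 306
print(minor_variants(column, "A"))

# A variant seen in 4% of reads falls below the 5% cut-off.
print(minor_variants("A" * 96 + "G" * 4, "A"))
```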


2.4 | Phylogenetic and molecular dating analysis 


The final consensus genome sequences after Sanger and Illumina 
sequencing were aligned with the available complete or near com- 
plete MERS-CoV genomes in GenBank using MUSCLE (Edgar, 
2004). Similarly, Spike gene and protein sequences were aligned. 
Phylogenetic trees were then inferred using the maximum likelihood 
(ML) method available in PhyML version 3.0 (Guindon et al., 2010)
using a general time-reversible (GTR) model with discrete
gamma-distributed rate variation among sites (Γ4) and an SPR
tree-swapping algorithm. To construct a time-scaled tree, we first
identified and removed all the recombinant sequences using RDP v4
(Martin, Murrell, Golden, Khoosal, & Muhire, 2015), and the remaining
full genome alignment was then analysed in BEAST v1.8.3 (Drummond,
Suchard, Xie, & Rambaut, 2012), using the HKY + Gamma4 substitution
model and an uncorrelated lognormal relaxed molecular clock. Genome and
S protein alignment and single nucleotide polymorphism (SNP)
visualization were performed using Harvest (Treangen, Ondov, Koren, &
Phillippy, 2014).


3 | RESULTS 


3.1 | Sequence analysis of six clusters of human-to- 
human transmission in the UAE 


Respiratory samples were collected from confirmed MERS-positive in- 
dividuals from three HCA, three household clusters and three sporadic 
cases from the UAE in 2013 and 2014 (Figures 1 and 2). Using genome- 
walking Sanger sequencing and/or Illumina amplicon sequencing, we were 
able to obtain full genome consensus sequences (30,123 bases) from a 
total of 19 available patient specimens. We obtained S and N gene se- 
quences for an additional four samples that failed full genome sequencing 
(Figure 2, Table 1). Alignment of these sequences with the other known 
MERS-CoV sequences showed >99% genetic identity. Single nucleotide 
polymorphism analysis of the aligned full genome sequences from this 
study against the HCoV-EMC/2012 sequence showed a range of 98 to 
113 nucleotide (nt) variations scattered along the genome (Figure 3a). 
Comparison of the spike genes to the HCoV-EMC/2012 strain showed 
a total of 31 SNPs (Figure 3b), causing nine amino acid (AA) changes, but 
none of these mutations appear to be distinct to these clusters. All se- 
quences generated in this study were deposited in GenBank (Table 1). 
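As a quick check on how the reported variation counts relate to percent identity, the arithmetic can be sketched as below. It assumes that each of the 98 to 113 variations is a single-nucleotide substitution and that the 30,123-base consensus length is the denominator; the function name `percent_identity` is hypothetical.

```python
GENOME_LENGTH = 30_123  # consensus genome length reported in the text


def percent_identity(n_differences, length=GENOME_LENGTH):
    """Percent identity, assuming every difference is one substituted site."""
    return 100.0 * (1 - n_differences / length)


for n in (98, 113):
    print(f"{n} differences -> {percent_identity(n):.2f}% identity")
```

Under these assumptions, both ends of the reported range correspond to more than 99.6% identity with the reference.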

Genetic relatedness of MERS-CoV genomes within each cluster 
supports a close association between proposed transmission partners 
(Figure 2, Table 1). In HCA cluster I, the index case (2013_002) was a 
man who owned a camel farm. The index case genome and the ge- 
nome from one of the contact cases (2013_004) differ by one nucleo- 
tide. The index and another contact (2013_003) have identical S genes 
(genome sequence unavailable). In HH cluster A, case 2013_008 had 
exposure to camels at a camel market. Case 2013_009 was the spouse 
of 2013_008, and the S gene sequences recovered from both cases are 
identical. Interestingly, the genome sequence from patient 2013_009 
clusters with a camel sample collected in Dubai (GenBank KP719927). 
For HH cluster B and HCA cluster II, there was only one specimen 
available from each cluster, so there is no comparative genomic data 
within these clusters. However, case 2013_011, although linked to a 
Dubai case, clustered with the sequences from HH A (2013_009 and 
2013_008) and the related Dubai camel sequence (Figure 4). 

FIGURE 3 Genome alignment and single nucleotide polymorphisms (SNPs)
within UAE cases. (a) Full genome sequences or (b) S gene nucleotide
sequences from the 2013-2014 UAE Middle East respiratory syndrome
coronavirus clusters, sporadic cases and camels were aligned with
parsnp, using Human betacoronavirus 2c EMC/2012 (GenBank JX869059) as a
reference sequence. Gingr was used to visualize SNPs, compared to the
reference genome. Each vertical bar in the graph represents a single
SNP. Asterisks in S gene alignment represent amino acid changes

All cases sequenced from both HCA cluster III and HH cluster C are
recorded as cases of direct transmission from the index case 2014_008,
except patients 2014_026, 2014_033 and 2014_045, who were apparent
tertiary transmission cases. All six patients in HCA cluster III with
direct contact with 2014_008 differ by 0-3 nt from the index case
2014_008 (Table 1). According to the contact tracing data, case
2014_030 may have been directly or indirectly infected by 2014_008
during their overlapping stays in the hospital ward, and 2014_030 went
on to infect 2014_045 and 2014_033 (Hunter et al., 2016). Complete
virus genomes from patients 2014_008, 2014_030 and 2014_033 were
sequenced and found to be identical (Figure 3a, Table 1). Both S and N
sequences from patient 2014_045 were identical to those from 2014_030
(Figure 3b, Table 1). These data are consistent with the
epidemiological data (Figure 2) (Hunter et al., 2016). The genomic data
provide clarification regarding the tertiary patient 2014_026. The
recorded transmission chain (2014_008 > 2014_018 > 2014_026) would
require three nt changes, including two reversions, while a different,
but plausible chain (2014_008 > 2014_023 > 2014_026) only requires one
nt change (Table 1). There is insufficient epidemiological data to
clarify this discrepancy. Genomes from the other three direct contacts
of 2014_008 in HH cluster C differ from it by one to two nt (Table 1).
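The comparison of candidate transmission chains for patient 2014_026 comes down to counting nucleotide differences between aligned genomes. The sketch below shows that counting on toy data. The six-base sequences are hypothetical placeholders, chosen only so that their distances to the index match Table 1 (2014_018: 2, 2014_023: 0, 2014_026: 1); they are not real MERS-CoV sequence, and `snp_distance` is an illustrative helper rather than the tool used in the study.

```python
def snp_distance(seq_a, seq_b):
    """Count mismatches between two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are
    skipped (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    unambiguous = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in unambiguous and b in unambiguous and a != b
    )


# Hypothetical toy alignment keyed by case ID.
toy = {
    "2014_008": "AAAAAA",  # index
    "2014_018": "CCAAAA",  # two changes from the index
    "2014_023": "AAAAAA",  # identical to the index
    "2014_026": "AAGAAA",  # one change from the index
}

for donor in ("2014_018", "2014_023"):
    changes = snp_distance(toy[donor], toy["2014_026"])
    print(f"{donor} > 2014_026 requires {changes} nt change(s)")
```

In the toy data, the step from 2014_018 needs three changes (two of them reversions to the index state), whereas the step from 2014_023 needs one, mirroring the reasoning in the text.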


3.2 | Minor variant analysis 


We performed a minor variant analysis on the available next-generation
sequencing (NGS) data to investigate whether MERS-CoV exists as a
diverse population in humans, as it does in camels (Briese et al.,
2014), and to identify relationships between MERS cases based on minor
variant associations. For the analysis, we ran CLC’s quality-based
variant detection algorithm using 5% as a conservative cut-off. We
observed mixed bases distributed throughout the genome in each sample
without noting obvious hot spots. This analysis shows that at position
11775 in index patient 2014_008, bases T and C are present in almost
equal numbers of reads, coding for isoleucine (base T) or threonine
(base C). However, in HCA cluster III and HH cluster C, only one
nucleotide or the other is present in the contact cases. Of the nine
direct contact cases sequenced, five harbour the C variant and four
harbour the T variant exclusively. Another notable observation was that
secondary cases 2014_011 and 2014_017 contain the nonsynonymous
mutation A12891C (ORF1ab E4208A). In the index case 2014_008, 30.6% of
the bases sequenced at that position are C. This suggests multiple
possibilities for founder viruses upon transmission.

FIGURE 4 PhyML tree analysis of Middle East respiratory syndrome
coronavirus (MERS-CoV) genome sequences. Maximum likelihood tree of
MERS-CoV genomes, generated using PhyML. The trees include sequences
from the 2013-2014 UAE clusters as well as camel-derived viruses
sequenced in this study and representative sequences from GenBank. The
lineages described in Sabir et al. (2016) are indicated. The clusters
described in the paper are highlighted in coloured boxes. [Colour
figure can be viewed at wileyonlinelibrary.com]


3.3 | Phylogenetic analysis of the UAE 
cases and their relatedness to camel MERS- 
CoV genomes 


To understand how the 2013-2014 UAE viruses relate to other known 
outbreak and camel strains, camels were sampled from regions of 
the UAE corresponding to human MERS-CoV cases. We sequenced 
MERS-CoV genomes recovered from three camels (1B-A, 2B-E and 
1H-F) near the KSA border, two camels (3B-C and 1H-D) from a farm 
near Al Ain and one camel (1H-B) from a farm in the Western Region, 
also linked with a known human case (Table 2). The sequences recov- 
ered from these camels were very similar to the human MERS-CoV 
sequences in this study, differing by 3-8 nt, compared to the nearest 
UAE human MERS-CoV sequence (Table 2). Further, we constructed 
maximum likelihood trees on the full genomes and S genes (Figure 4,
Figure S1). The majority of human and camel MERS-CoVs sequenced
in this study belong to lineage 2 as defined by Sabir et al. (2016).
Sequences for which we only have the S/N sequence (2013_007,
2013_008, 2013_003 and 2014_045) also cluster within lineage 2
(Figure S1). Healthcare-associated cluster III and HH cluster C form a
monophyletic group and are closely related to two camel sequences,
3B-C and 1H-D. The rest of the clusters are scattered at different
positions within the lineage. Among them, HCA cluster I from 2013
defines an isolated cluster within lineage 2. HH clusters A and B are
closely related to each other and to one of the Oman cases from 
around the same time period (Figure 4). The sequences from HH 
clusters A and B also cluster with a MERS-CoV genome recovered 
from a camel in the UAE earlier that year (GenBank KP719927). The 
sequence from the HCA cluster II index case (2014_002) is closely
related to the camel sequence 1H-B, which comes from a farm in the
Western Region. They differ by three nt, which is consistent with their
potential epidemiologic link (the patient was from the Western Region
and had contact with camels). While there may be no direct linkage
among the different small UAE clusters (HCA I, HH A, HH B and HCA
II), HCA cluster III or HH cluster C, the phylogenetic closeness implies
a temporal and geographic constraint. The close connection between
geographic and phylogenetic relatedness is also apparent in the
context of the other known MERS-CoV sequences (Figure 4).


We also sequenced several sporadic cases of MERS-CoV. The first 
was a UAE case, which was thought to be acquired on travel to Jeddah, 
KSA (2014_032). This genome clusters with the Jeddah 2014 group in 
lineage 4 and is genetically distinct from the other UAE cases (Figure 
4). The second is from a person from Oman who had extensive contact 
with farm animals (2013_007). We sequenced the S gene sequence 
from this virus and found that it clusters near other Omani human 
cases from 2013 as well as several camel MERS-CoV genomes from 
the same time period (Figure S1). Another case (2014_XXX) had no
available case information, but notably, its genome sequence is phy- 
logenetically linked to the three camel MERS-CoV sequences (seven 
nt difference) which were sampled from the KSA/UAE border in this 
study (1B-A, 2B-E and 1H-F) (Figure 4, Table 2). 

To further define when the outbreak MERS-CoVs emerged with 
respect to other clusters and to the camel viruses, we constructed a 
time-scaled tree using BEAST (Figure 5). Our estimation of evolution 
rate was between 6.5 × 10⁻⁴ and 9.2 × 10⁻⁴ for the entire
MERS-related genome dataset (recombinants excluded). Under that rate, the
divergence date between UAE clusters and the Al Hasa cluster was 
most likely before August 2012 (Figure 5). Furthermore, the divergence
date of the genome at Node A, the common ancestor of human and camel
viruses associated with UAE clusters HCA III and HH C, is estimated to
be early January 2014. Node C represents a separate divergence of human
2014_XXX and camel CoVs which occurred around the same time. Human and
camel case-associated viruses also are estimated to have diverged at
Node B in February 2014 (Figure 5). Each of these nodes represents a
separate introduction of camel viruses into the human population.
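To give a rough sense of scale for the rate estimate, the sketch below converts a per-site rate into expected changes per genome. Several assumptions are made that the text does not state: the rate is taken to be in substitutions per site per year, the bounds are taken as 6.5 × 10⁻⁴ and 9.2 × 10⁻⁴, and the rate is treated as constant over time (the analysis itself used a relaxed clock). The genome length of 30,123 bases comes from the Results, and the function name is hypothetical.

```python
GENOME_LENGTH = 30_123  # consensus genome length reported in the text


def expected_pairwise_differences(rate_per_site_per_year, years_since_split,
                                  length=GENOME_LENGTH):
    """Expected nt differences between two lineages after a split.

    Both lineages accumulate changes independently, hence the factor 2.
    Assumes a constant rate in substitutions/site/year (an assumption).
    """
    return 2 * rate_per_site_per_year * length * years_since_split


for rate in (6.5e-4, 9.2e-4):  # assumed bounds and units
    per_genome_per_year = rate * GENOME_LENGTH
    three_months = expected_pairwise_differences(rate, 3 / 12)
    print(f"rate {rate:.1e}: {per_genome_per_year:.1f} changes/genome/year, "
          f"{three_months:.1f} expected pairwise differences after 3 months")
```

Under these assumptions, a genome accumulates roughly 20 to 28 changes per year, so pairwise distances of a few nucleotides correspond to divergence on a timescale of months rather than years.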


4 | DISCUSSION 


Genomic studies of MERS-CoV with accompanying epidemiology are 
important and can reveal spatiotemporal transmission chains to sup- 
port epidemiologic investigations from a zoonotic event with subse- 
quent human-to-human transmission. These studies may also lead to 
improved understanding of the underlying genotypic mutations that 
drive phenotypic changes. Larger transmission events, like those in 
Al Hasa, Jeddah and Korea (Assiri et al., 2013; Oboho et al., 2015; 
Park et al., 2015) provide opportunities for molecular epidemiological 
analysis to understand whether changes in the virus genome lead to 
increased transmission, better fitness or adaptation to treatment. In 
spring of 2014, there was a steep increase in the number of MERS-CoV 
cases reported to WHO. This study demonstrates that these outbreaks 
in the UAE are likely due to independent zoonotic transmission events 
followed by nosocomial amplification, which is consistent with the
epidemiological investigations (Al Hosani et al., 2016; Hunter et al., 2016).


FIGURE 5 BEAST time-scaled tree of Middle East respiratory syndrome coronavirus (MERS-CoV) cases. After filtering recombinant MERS-
CoV genomes using RDP, the remaining genomes were analysed using BEAST to understand divergence times between the various human cases
and between the human and camel cases. Node A shows a divergence between a camel virus and the 2014_008-associated cluster (Healthcare-
associated clusters III and HH C) virus that occurred separately, but concurrent, with the divergence at Node C, between a camel virus and
human case 2014_XXX. Likewise, the divergence of the 2014_002 virus from a camel virus happened at Node B, separately and later than the
other two examples. The clusters described in the paper are highlighted in coloured boxes. [Colour figure can be viewed at
wileyonlinelibrary.com]

In the UAE during 2013-2014, there appear to be six independent
introductions of the virus, emerging from the same lineage (lineage 2) 
(seven, including imported case 2014_032). Unsurprisingly, sequences 
within each human cluster were very close at a nucleotide level, and 
in general, did not vary more than three nucleotides within the same 
transmission chain (Table 1). Further, we did not identify any signa- 
ture amino acid changes in the spike genes associated with the larger 
transmission clusters. Thus, there is no compelling genetic evidence 
for a more transmissible virus, rather, the high rate of transmission was 
likely due to the close contact of the patients with MERS. 

This study provides temporal, geographic and genetic data link- 
ing actual human infections (clusters and sporadic cases) to camel 
MERS-CoV. It is now clear that dromedary camels are a major reser- 
voir of MERS-CoV and an important species in transmitting the virus 
to humans (Alagaili et al., 2014; Azhar et al., 2014; Ferguson & Van 
Kerkhove, 2014). We know that several of the index cases in this 
study had close contact with camels from different regions of the UAE 
(Al Hosani et al., 2016; Hunter et al., 2016). Here, we show genomic 
evidence linking the MERS-CoV genome from camel 1H-B (Western
Region) to case 2014_002, a person who visited the farm where this
animal was kept; the two genomes differ by three nt. Although
the camels from the KSA/UAE border (1B-A, 2B-E and 1H-F) were not
directly associated with any known human cases, their genome sequences
clustered with human case 2014_XXX at a distance of eight nucleotides.
This suggests a zoonotic link to case 2014_XXX. In HCA cluster III,
although there was no recorded camel contact with the index
patient, sequences collected from the Al Ain-area camels 3B-C and
1H-D just before the outbreak (February 2014) show close similarity
at the nucleotide level (seven nt) to the human index case 2014_008
(Figure 3, Table 2). Additionally, the 2013_002 case, who lived near
the KSA border and had frequent contact with camels, harboured a 
MERS-CoV sequence closely linked to a camel-derived sequence from 
KSA already published in GenBank (Muhairi et al., 2016). The fact that 
the camel from Dubai (KP719927) is closely related to HH A cases 
suggests that MERS-CoV may have been imported to an Abu Dhabi 
camel market from Dubai. 

The camel- and human-derived MERS-CoV sequences do not 
phylogenetically segregate based on host, rather, animal and human 
viruses are interspersed throughout the trees. This suggests repeated 
independent introductions or cocirculation between camels and 
humans, as has been concluded in other cases (Cotten et al., 2013; 
Drosten, Kellam, et al., 2014; Memish et al., 2014). Taken together 
with the molecular clock data showing different human/camel virus 
divergence dates for different human MERS outbreaks, we conclude 
that the human and camel viruses are not distinct viruses and at least 
some strains are likely capable of infecting both species. Notably, 
human-to-human transmission chains have been relatively short, 
which suggests that MERS-CoV may not be transmitted efficiently 
from human to human at this time, having a transient and mostly dead- 
end infection. Further study of MERS-CoV in camel populations may 
lead to an understanding of the viral genes that are important in
tropism and replication fitness in camel versus human hosts.


It is becoming more important to understand the contribution of 
viral quasi-species and minor variants in the MERS-CoV life cycle.
We demonstrated some limited usefulness in linking epidemiologi- 
cal cases, where a minority population converts to the predominant 
population in the transmission partner. This complicates simple 
phylogenetic relationships between cases, as there may be appre- 
ciable diversity in an infected host to transmit different viruses to 
two different contacts. Briese et al. demonstrated this phenom- 
enon in linking a camel MERS-CoV minor population genome di- 
rectly to a human MERS case (Briese et al., 2014). A newer study 
examining camel MERS-CoV isolates observed that although there 
were only a handful of differences between strains at the consen- 
sus level, there were hundreds of intrahost variants (Borucki et al., 
2016). Appreciating this phenomenon will help in more accurately 
and robustly identifying and dating transmission pairs and in under- 
standing variants associated with fitness. 

This study provides genomic information complementing an ep- 
idemiological investigation of a MERS-CoV outbreak and provides 
evidence that the outbreak viruses emerged from camels either di- 
rectly or in the months leading up to the outbreak. The molecular 
data generated here support and clarify the contact tracing records 
and provide contextual information to potential future outbreaks. 
These data will be important in determining what molecular changes 
in the virus may lead to increased transmission between humans. 
Exposure to camels is a risk factor for human MERS-CoV infection, 
and undoubtedly, new cases will continue to emerge. Surveillance 
of camels on the Arabian Peninsula and in eastern Africa has shown 
a high rate of seropositivity as well as some MERS-CoV shed- 
ding (Alagaili et al., 2014; Corman et al., 2014; Haagmans et al., 
2014; Miller et al., 2014; Raj et al., 2014; Reusken, Haagmans, 
et al., 2013). There is a camel MERS-CoV vaccine in development 
(Haagmans et al., 2015; Song et al., 2013), but in the near term, it is 
critical that individuals exposed directly to camels take precautions 
to minimize the risk of MERS-CoV transmission. Broader genetic 
and/or culture-based screening of camel populations is needed to 
understand the prevalence and distribution of coronaviruses and
other viruses which may cause disease in humans.